feat/docs/cases: Covalent Bond Input #329

Open
wants to merge 6 commits into
base: dev


@YaoYinYing YaoYinYing commented Aug 29, 2024

This is one of the PRs split out from #321.

Full PR roadmap

id | purpose | # commits | Affected
1 | Hydra-Omegaconf and Pip module | 6 | Code, Config, Doc
2 | BFD supports and MSA parallelism fixes | 5 | Code, Config, Case
3 | Small molecule inputs and covalent bonds | 4 | Code, Case, Doc

Changelog

Added

  • Support for SDF file inputs for small molecules and additional file formats.
  • Expanded set of demo cases.
  • PTM Inputs (phosphorylations, isopeptides, metal bindings, disulfide bridges, glycosylations, etc.)

Fixed

  • Executables (binaries) that are not explicitly specified are now located automatically by searching the PATH.
  • Resolved an issue with dummy assertions of ref_atom_name_chars and problematic atoms that caused errors.
  • Addressed the None-value in TemplateAtomMaskAllZerosError to prevent incorrect warning triggers.
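
The PATH fallback from the first "Fixed" item could look roughly like the sketch below. Only the function name, its signature, and the error message appear in this PR's diff (`resolve_bin_path` in `inference.py`); the body is an assumption built on the standard-library `shutil.which`, not the PR's actual implementation.

```python
import os
import shutil
from typing import Optional


def resolve_bin_path(cfg_path: Optional[str], default_binary_name: str) -> str:
    """Return a usable executable path, falling back to a PATH lookup.

    Assumed behaviour, not copied from the PR: prefer the configured path
    when it is an executable file, otherwise search PATH for the default name.
    """
    # 1. Use the configured path if it points at an existing executable file.
    if cfg_path and os.path.isfile(cfg_path) and os.access(cfg_path, os.X_OK):
        return cfg_path

    # 2. Otherwise look the default binary name up on PATH.
    found = shutil.which(default_binary_name)
    if found:
        return found

    # 3. Nothing usable: fail with the message shown in the diff.
    raise FileNotFoundError(
        f"Could not find a proper binary path for {default_binary_name}: {cfg_path}."
    )
```

With a fallback of this shape, a config entry can be left empty as long as the tool is installed somewhere on PATH, and a wrong explicit path still fails loudly if the tool cannot be found there either.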

drop: buildin logger

fix: template try-except bugs

fix: output logs

refactor: deduplicate code

fix: hetatm input raise from smiles

fix: hetatm input raise and dialognoses

docs&cases: covalent bonds

feat: covalent bond

fix: covalent bond

add: use_3d opt for ligand
@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@alephreish

@YaoYinYing I had to bring back a couple of imports in order to make it work:

diff --git a/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py b/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
index 603e8da..c994b0d 100644
--- a/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
+++ b/apps/protein_folding/helixfold3/infer_scripts/feature_processing_aa.py
@@ -19,6 +19,7 @@ import os
 from pathlib import Path
 import pickle
 from typing import List, Mapping, Optional, Tuple
+import time

 import numpy as np
 import logging
@@ -28,7 +29,7 @@ from helixfold.data import pipeline_multimer
 from helixfold.data import pipeline_rna_multimer
 from helixfold.data import pipeline_conf_bonds, pipeline_token_feature, pipeline_hybrid
 from helixfold.data import label_utils
-
+from concurrent.futures import ProcessPoolExecutor, as_completed
 from helixfold.data.tools import utils
 
 from .preprocess import Entity, digit2alphabet
diff --git a/apps/protein_folding/helixfold3/inference.py b/apps/protein_folding/helixfold3/inference.py
index 429809b..9edc0a1 100644
--- a/apps/protein_folding/helixfold3/inference.py
+++ b/apps/protein_folding/helixfold3/inference.py
@@ -24,6 +24,7 @@ import pickle
 import pathlib
 import shutil
 import numpy as np
+import logging
 from helixfold.common import all_atom_pdb_save 
 from helixfold.data.pipeline_conf_bonds import load_ccd_dict
 from helixfold.model import config, utils
@@ -116,7 +117,7 @@ def resolve_bin_path(cfg_path: str, default_binary_name: str)-> str:

     raise FileNotFoundError(f"Could not find a proper binary path for {default_binary_name}: {cfg_path}.")
 
-def get_msa_templates_pipeline(cfg: DictConfig) -> Dict:
+def get_msa_templates_pipeline(cfg) -> Dict:
     use_precomputed_msas = True  # Assuming this is a constant or should be set globally
     
     template_searcher = hmmsearch.Hmmsearch(
diff --git a/apps/protein_folding/helixfold3/utils/model.py b/apps/protein_folding/helixfold3/utils/model.py
index 4a5b2d6..2ba6337 100644
--- a/apps/protein_folding/helixfold3/utils/model.py
+++ b/apps/protein_folding/helixfold3/utils/model.py
@@ -17,6 +17,7 @@
 import numpy as np
 import paddle
 import paddle.nn as nn
+import logging
 import io
 
 from helixfold.model import modules_all_atom

Also a side note: leave_atom_flag is currently ignored, so the user has to modify ccd_preprocessed_etkdg.pkl.gz to remove the atoms that leave upon formation of the corresponding covalent bond.
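
A rough sketch of that manual workaround, using only the standard `gzip` and `pickle` modules. The layout of `ccd_preprocessed_etkdg.pkl.gz` is not described in this thread, so everything about the data structure here is hypothetical: the sketch assumes a dict keyed by component code whose entries hold an atom-name list under an assumed `atom_ids` key plus parallel per-atom lists. Inspect the real file and adapt the key names before using anything like this; bond tables and array-typed fields are not handled.

```python
import gzip
import pickle
from typing import Any, Dict, Iterable


def drop_atoms(entry: Dict[str, Any], leaving: Iterable[str],
               name_key: str = "atom_ids") -> Dict[str, Any]:
    """Return a copy of one component entry without the named atoms.

    Hypothetical layout: entry[name_key] lists atom names, and every other
    list of the same length is treated as a per-atom column.
    """
    leaving = set(leaving)
    names = list(entry[name_key])
    keep = [i for i, name in enumerate(names) if name not in leaving]
    trimmed = {}
    for key, value in entry.items():
        if isinstance(value, list) and len(value) == len(names):
            trimmed[key] = [value[i] for i in keep]  # filter per-atom column
        else:
            trimmed[key] = value  # not a per-atom list: left untouched
    return trimmed


def patch_ccd_file(src: str, dst: str, code: str, leaving: Iterable[str]) -> None:
    """Write a patched copy of a gzip-compressed pickle of component entries."""
    with gzip.open(src, "rb") as fh:
        ccd = pickle.load(fh)
    ccd[code] = drop_atoms(ccd[code], leaving)
    with gzip.open(dst, "wb") as fh:
        pickle.dump(ccd, fh)
```

Writing to a separate `dst` keeps the original file intact, which matters because the patched entry then applies to every use of that component, not just to the covalently bound copy.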

@YaoYinYing
Author

@alephreish Hi Andrey, this PR is a cherry-pick (so it may contain some bugs) from a branch that is now definitely out of date. If you are looking for a full-featured branch for your project, please consider this fork.

@alephreish

alephreish commented Oct 30, 2024

@YaoYinYing I've seen the main branch in your fork. I personally like the current interface of helixfold: it's flexible enough, although switching to your new interface would not be a big deal. I have the feeling that this small PR does have a chance of being merged into PaddlePaddle/PaddleHelix:dev, since patch-hydra has diverged too much by now.

4 participants